[fix](fe) Prevent cast project pushdown through union distinct#64080
[fix](fe) Prevent cast project pushdown through union distinct#64080morrySnow wants to merge 2 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
There was a problem hiding this comment.
Review result: no blocking issues found in this PR.
Critical checkpoint conclusions:
- Goal and proof: The PR prevents non-injective cast projections from being pushed below UNION DISTINCT, preserving distinctness before the outer cast. The added unit guard and regression query cover the reported datetime-to-date collision scenario.
- Scope and clarity: The code change is small and focused on the existing canPushProject gate. It does not introduce a new rule or alter rule registration.
- Concurrency and lifecycle: Not applicable; this is a deterministic optimizer rewrite predicate and test update with no shared mutable state, locks, threads, static initialization, or lifecycle-sensitive resources.
- Configuration and compatibility: No new configs, persistence formats, FE-BE protocol fields, or storage compatibility concerns.
- Parallel paths: The rewrite is attached to LogicalSetOperation, but SQL analysis rejects INTERSECT ALL and EXCEPT ALL, and the guarded cast path remains appropriate for UNION ALL while DISTINCT set operations now reject non-slot projections.
- Conditions and error handling: The new qualifier condition is explicit and conservative. No Status/exception handling paths are changed.
- Tests and expected results: The unit test checks ALL vs DISTINCT behavior, and the regression test validates the end-to-end duplicate-preserving result for UNION DISTINCT followed by a lossy cast. The output is deterministic via order_qt.
- Observability, transactions, memory: Not applicable for this optimizer-only change.
- Performance: The change may skip some previously allowed harmless alias/cast pushdowns through DISTINCT, but this is an acceptable conservative tradeoff for correctness.
User focus: No additional user-provided review focus was supplied.
Verification performed: git diff --check passed. I attempted ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PushProjectThroughUnionTest, but this runner is missing thirdparty/installed/bin/protoc, so generated-source setup failed before tests could run.
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29308 ms |
TPC-DS: Total hot run time: 170262 ms |
|
/review |
|
run buildall |
1cc69e1 to
0b76718
Compare
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found one correctness issue that can still make PushProjectThroughUnion change UNION DISTINCT results.
Checkpoint conclusions:
- Goal/test: The PR addresses the intended datetime-to-date regression and adds unit/regression coverage, but coverage misses complex-to-character casts with non-injective nested formatting.
- Scope/focus: The rewrite change is focused, but the new safe-cast classification is broader than the proof needed by the rule.
- Concurrency/lifecycle/config/compatibility/persistence/writes: Not applicable; this is an optimizer rewrite/type-classification change.
- Parallel paths: UNION ALL remains unchanged; UNION DISTINCT now uses the new classification, but complex-to-character paths need the same injectivity standard as primitive casts.
- Test result review: The added regression output is deterministic and matches the intended case. Additional negative coverage is needed for complex values containing float/double before this is safe.
- Observability/performance: Not applicable beyond normal rewrite behavior.
- User focus: No additional user-provided focus points were supplied.
Please fix the complex-to-character safe-cast classification before merging.
| public boolean canSafetyCastTo(DataType target) { | ||
| if (target instanceof ArrayType) { | ||
| return itemType.canSafetyCastTo(((ArrayType) target).itemType); | ||
| } |
There was a problem hiding this comment.
canSafetyCastTo is now used to decide whether UNION DISTINCT duplicate elimination may be moved after the cast, so true here must mean the cast is injective. Complex-to-character casts are not always injective: for arrays/maps/structs containing FLOAT or DOUBLE, direct float/double-to-string is intentionally not marked safe, but complex stringification delegates to the nested serde (for arrays, DataTypeArraySerDe::to_string calls nested_serde->to_string) and BE formats float/double with only digits10 + 1 significant digits in cast_to_string.h. Two distinct arrays whose float elements differ only beyond that formatted precision can therefore stringify to the same value; after this rewrite the pushed cast would make UNION DISTINCT collapse a row that the original plan kept. The same unconditional target instanceof CharacterType pattern in MapType and StructType has the same issue. Please either leave complex-to-character casts unsafe or require the nested types themselves to be injective when cast to the character target.
There was a problem hiding this comment.
I found one blocking correctness issue in the new safe-cast model. The PR goal is clear: allow project/cast pushdown through UNION DISTINCT only when the cast preserves distinctness. The implementation mostly follows the existing rule shape and has focused FE unit/regression coverage for primitive datetime/integer cases, but the complex-to-string cases currently marked safe are not distinctness-preserving and can change query results.
Critical checkpoint conclusions:
- Goal/test: The goal is partially achieved for tested primitive cases, but complex-to-string safe-cast coverage violates the core UNION DISTINCT invariant; existing tests assert these cases safe instead of covering a collision.
- Scope/clarity: The change is focused, but the new canSafetyCastTo contract needs stricter evidence before broad type-family allowances.
- Concurrency/lifecycle/config/compatibility/observability/transactions/data writes: Not applicable; this is an optimizer rewrite/type-semantics change.
- Parallel paths: The rule is used both directly and via LogicalSetOperation project processing; the unsafe predicate affects both paths.
- Test coverage/results: Added tests cover the original datetime regression and some type predicates, but miss non-injective complex stringification.
- Performance: No new performance issue found.
User focus: No additional user-provided review focus was specified.
TPC-H: Total hot run time: 29503 ms |
TPC-DS: Total hot run time: 169465 ms |
FE Regression Coverage ReportIncrement line coverage |
FE UT Coverage ReportIncrement line coverage |
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary: PushProjectThroughUnion allowed cast projections to be pushed below UNION DISTINCT. For non-injective casts, distinct was then evaluated after the cast instead of before the cast, which could incorrectly collapse rows that were distinct in the original union output. Restrict cast project pushdown to UNION ALL and add tests for the qualifier guard and the DORIS-26135 datetime-to-date regression case.
### Release note
None
### Check List (For Author)
- Test: Regression test / Unit Test
- ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PushProjectThroughUnionTest
- ./run-regression-test.sh --run -d nereids_syntax_p0 -s set_operation
- mvn checkstyle:check -pl fe-core
- Behavior changed: No
- Does this need documentation: No
0b76718 to
b3c474c
Compare
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found one remaining data-correctness issue in the new injective-cast predicate. The main rule change itself is focused and the datetime-to-date UNION DISTINCT regression covers one important unsafe cast, but JSONB-to-character is still marked safe even though the BE cast collapses distinct JSON values.
Critical checkpoint conclusions:
- Goal/test: The goal is to prevent pushing non-injective casts below UNION DISTINCT; tests cover datetime-to-date and several predicate cases, but not JSONB-to-string.
- Scope: The implementation is small and localized to the rewrite predicate plus type methods.
- Concurrency/lifecycle/config/incompatibility: No concurrency, lifecycle, config, persistence, storage format, or FE-BE protocol compatibility concerns found in this PR.
- Parallel paths: UNION ALL remains intentionally permissive; UNION DISTINCT now uses the new predicate, but the predicate still has an unsafe JSONB character path.
- Test coverage/results: Added unit/regression tests are relevant, but missing coverage for JSONB string-vs-number collisions. Existing review threads already cover complex-to-character serialization issues, so I did not duplicate them.
- Observability/performance: No new observability requirement; no material performance concern found in the reviewed path.
User focus: No additional user-provided review focus was specified.
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29672 ms |
TPC-DS: Total hot run time: 170047 ms |
FE Regression Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
Issue Number: close #xxx
Related PR: #xxx
Problem Summary: PushProjectThroughUnion can safely push casts through UNION DISTINCT only when the cast preserves distinctness. Add DataType.canSafetyCastTo coverage for primitive and complex types, fix unsafe or incomplete edge cases found during review, treat string-like casts as length-insensitive, and add unit tests for the new interface and pushdown qualifier behavior.
None
- Test: Unit Test
- ./run-fe-ut.sh --run org.apache.doris.nereids.types.DataTypeTest,org.apache.doris.nereids.rules.rewrite.PushProjectThroughUnionTest
- ./run-fe-ut.sh --run org.apache.doris.nereids.types.DataTypeTest
- mvn checkstyle:check -pl fe-core
- Behavior changed: No
- Does this need documentation: No
b3c474c to
06d2fea
Compare
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found one new blocking correctness issue in the DISTINCT set-operation guard.
Critical checkpoint conclusions:
- Goal/test: The PR aims to prevent project pushdown from changing set-operation distinctness. The new tests cover a single-column non-injective cast through UNION DISTINCT, but they do not cover projections that duplicate/drop set-op output slots; that path can still change results.
- Scope/focus: The implementation is small and focused, but the safety check is only per expression and is not sufficient for the whole projection tuple.
- Concurrency/lifecycle/config/compatibility/transactions/FE-BE protocol/observability: Not applicable for this FE optimizer metadata/rewrite change.
- Parallel paths: The rule is attached to LogicalSetOperation. INTERSECT/EXCEPT ALL are rejected during binding, so the reachable ALL fast path is UNION ALL; DISTINCT set ops rely on the same incomplete projection guard.
- Tests/results: Existing new regression output is deterministic, but coverage is missing for duplicate-slot/permutation cases under DISTINCT.
- Performance: No new performance issue found.
- User focus: No additional user-provided review focus was supplied.
I did not repeat the already-open inline threads about JSONB/complex-to-character cast injectivity.
| Expression expr; | ||
| if (logicalSetOperation.getQualifier().equals(Qualifier.ALL)) { | ||
| expr = ExpressionUtils.getExpressionCoveredByCast(e.child(0)); | ||
| } else { |
There was a problem hiding this comment.
This still lets UNION DISTINCT run after a projection that is not injective as a tuple. The check proves each expression is a slot or an injective cast of one slot, but it never verifies that the referenced slots form a permutation of the set-op output. For example:
select cast(a as bigint), cast(a as bigint)
from (select 1 a, 2 b union select 1, 3) t;Before the rewrite, the inner union keeps two distinct (a,b) rows and the outer project returns two (1,1) rows. After this rule pushes the project into both children, the union distinct sees only (1,1) and collapses them to one row. For non-UNION ALL set ops, please require the stripped slots to cover every output slot exactly once (or otherwise prove the whole projection vector is injective), not just each expression independently.
TPC-H: Total hot run time: 29594 ms |
TPC-DS: Total hot run time: 169669 ms |
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary: PushProjectThroughUnion previously pushed cast projections below UNION DISTINCT without checking whether the cast preserved distinctness. For non-injective casts, distinct was then evaluated after the cast instead of before the cast, which could incorrectly collapse rows that were distinct in the original union output. This PR adds DataType.canSafetyCastTo to model casts that are safe for this rewrite, uses it to allow cast pushdown through UNION DISTINCT only when the cast is distinctness-preserving, and keeps UNION ALL pushdown unchanged. The implementation also covers primitive and complex types, including length-insensitive string-like casts, and fixes edge cases found during review for integral, map, and struct types.
Release note
None
Check List (For Author)